fix: lifecycle & teardown correctness (PR1 of bug-audit-v2)#108
Merged
Conversation
Verify-then-extend pass over the 2026-05-31 audit: 26/29 prior findings FIXED, 3 PARTIAL. Adds 26 new findings under UX, logic, security, and tests, with file:line evidence and fix shapes. pip-audit and bandit run; results folded in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decomposes the 26 new audit findings into 3 themed PRs: lifecycle/teardown correctness, config UX + security validation, and hygiene + CI gate. Per-PR scope, locked decisions, and file lists included. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-task TDD steps covering all 10 sub-fixes (LOG-1..9, SEC-4) from the 2026-06-05 audit. LOG-1 corrected to docstring-only after verifying OTel's set_tracer_provider is set-once-enforced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bandit B101 flagged the assert as stripped under `python -O`. Replace with explicit raise so the invariant holds under all Python optimization levels. Ruff's TRY004 enforces TypeError for isinstance-guarded raises. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…EC-4 part B) Bandit B101 flagged the assert as stripped under `python -O`. Replace with explicit raise. Resolves the second of the two bandit B101 findings in PR1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The OTel SDK enforces set_tracer_provider as set-once per process. teardown() calls shutdown() (added in CRIT-2) but cannot reset the process-global pointer. Document the supported lifecycle: one OpenTelemetryInstrument per process. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bootstrap() previously set opentelemetry.instrumentation.instrumentor and opentelemetry.trace loggers to disabled=True unconditionally, with no symmetric restoration. Capture the pre-bootstrap state and restore on teardown so unrelated code in the same process retains its expected logger configuration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (LOG-3) A raise from handler.close() mid-loop previously left remaining handlers attached, skipped the root-level reset, and never called close_handlers on the factory. Wrap in try/finally and aggregate close errors into TeardownError (matching BaseBootstrapper.teardown's pattern) so the rest of teardown completes before all collected errors are re-raised together. Factory close failures are included in the aggregation so they don't silently replace handler errors via Python's finally-replaces-pending semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rdown (LOG-4) bootstrap() mutates broker.config.logger.params_storage to inject a structlog logger; teardown() now captures and restores the original value. Symmetric with LoggingInstrument's parent teardown, which still runs via super(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bootstrap() called sentry_sdk.init() but no teardown reset the global state. Process-local tests had to call sentry_sdk.init() in finally blocks as a workaround; those are now replaced by instrument.teardown(). flush() drains in-flight events before the reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous dict[int, ASGIApp] keyed by id() never evicted, holding wrapper OpenTelemetryMiddleware instances alive after Litestar dropped the next_app reference. Switch to weakref.WeakKeyDictionary so wrappers GC with their apps. Skip caching gracefully when an app doesn't support weak references. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…G-7) Constructing two FastMcpBootstrappers around the same application previously stacked two _TeardownProvider instances, doubling the teardown call on shutdown. Detect an existing _TeardownProvider in application.providers and skip re-attach with a warning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Constructing two FastAPIBootstrappers around the same application previously stacked two lifespan wrappers, calling teardown twice on shutdown. Detect the sentinel on application.state and skip re-wrap with a warning. Idempotent teardown (CRIT-3) means the prior behavior was harmless, but this removes the log noise and confusing stack depth. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-quality review on PR1 Task 9 noted the warning text didn't make clear that the second FastMcpBootstrapper has no teardown wired up. Add that line. Tests still match on "_TeardownProvider" so no test change needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Jun 5, 2026
lesnik512
added a commit
that referenced
this pull request
Jun 5, 2026
Three PRs (#108, #109, #110), 26 findings closed, 153 → 187 tests, 100% coverage throughout. Captures what went well (per-PR sequencing, two-stage review catching three real bugs), what didn't (three implementer-hallucination incidents, cross-task __post_init__ cascade missed in PR2 plan, OTel API misread in audit, SEC-2 host parser untested against canonical localhost:4317 form), and six action items to harden the pipeline for the next audit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 tasks
lesnik512
added a commit
that referenced
this pull request
Jun 5, 2026
Bug-audit-v2 cycle: 26 findings across 4 PRs (#108-#111). Lifecycle hardening, config validation, CI gate, generalized TeardownError aggregation. Two behavior changes called out: FastAPIConfig no longer stomps user app fields; CorsConfig wildcard+credentials now raises. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR1 of the 2026-06-05 bug-audit-v2 plan. Closes ten lifecycle / teardown findings — every fix is "bootstrap touched state, teardown didn't clean it up" or "constructing two bootstrappers around the same app caused stacking." Sequenced per
planning/specs/2026-06-05-bug-audit-v2-sequencing.md; per-task TDD plan atplanning/plans/2026-06-05-pr1-lifecycle.md; parent audit atplanning/specs/2026-06-05-bug-audit-v2.md.Findings closed
set_tracer_providerset-once constraint. The SDK enforces it via_TRACER_PROVIDER_SET_ONCE.do_once(...); the "global reset on teardown" fix in the audit isn't achievable without touching SDK internals, so this becomes a class docstring that names the constraint and the supported lifecycle.opentelemetry.instrumentation.instrumentor,opentelemetry.trace) to their pre-bootstrapdisabledstate on teardown. Capture-and-restore via a new_prior_logger_disabled: dict[str, bool]field.LoggingInstrument.teardown()no longer dies on ahandler.close()raise mid-loop. Errors are aggregated intoTeardownError(matchingBaseBootstrapper.teardown's pattern) so the rest of teardown — level reset, factory close — always runs. Factory-close errors are also caught and included in the aggregate so they don't silently replace handler errors viafinally-replaces-pending semantics.FastStreamLoggingInstrumentcapturesbroker.config.logger.params_storagein bootstrap and restores it in teardown. Symmetric to LOG-2 on the broker side.assertprecondition checks with explicit raises in_narrow_app(TypeError, per ruff's TRY004) andPyroscopeInstrument.bootstrap(RuntimeError, since the check isis Nonenotisinstance). Closes both bandit B101 findings.LitestarOpenTelemetryInstrumentationMiddleware._otel_appsfromdict[int, ASGIApp](id-keyed, never evicts) toweakref.WeakKeyDictionary[ASGIApp, ASGIApp]so wrapper apps GC with theirnext_app. Both.get()and__setitem__are wrapped incontextlib.suppress(TypeError)so non-weakrefable callables fall through unharmed.FastMcpBootstrapper.__init__now inspectsapplication.providersfor an existing_TeardownProviderand warns + skips re-attachment, preventing stacked teardown invocations when two bootstrappers are constructed around the same FastMCP app.FastAPIBootstrapper.__init__usesapplication._lite_bootstrap_lifespan_attached(private attribute on the FastAPI instance) as a sentinel to detect and warn on duplicate lifespan-wrap attempts. Uses the library-private-attribute convention rather than squatting in Starlette's user-facingapplication.statenamespace.SentryInstrument.teardown()callssentry_sdk.flush(timeout=2)thensentry_sdk.init()(no args) to disable further event capture. The two existing tests that calledsentry_sdk.init()infinallyas a workaround now useinstrument.teardown()instead.Implementation notes
raise close_errors[0](lost info on multiple failures and silent factory-close-error masking); rewritten to use the project's existingTeardownError(errors)pattern.__setitem__incontextlib.suppress;WeakKeyDictionary.get()also raisesTypeErroron non-weakrefable keys, so both call sites were wrapped and the test was rewritten to use a real non-weakrefable class (__slots__ = (), no__weakref__slot).application.state; moved to a_lite_bootstrap_lifespan_attacheddirect attribute matchingopentelemetry-instrumentation-fastapi's approach.TeardownErroraggregation pattern fromLoggingInstrumentto other instruments' teardown methods (currently onlyLoggingInstrumentandBaseBootstrapperuse it).README.md.Test plan
just test— 164 passed (10 new tests + 1 updated existing), 100% coverage.just lint-ci— clean (ruff format, ruff check, eof-fixer, ty).bandit -r lite_bootstrap— no issues (both prior B101 findings closed by LOG-5/SEC-4).🤖 Generated with Claude Code